Finite State Transducers for Recognition and Generation of Compound Words
Authors
Abstract
In this paper we show how finite state transducers can be used effectively to treat compounds in text analysis. The approach is particularly well suited to text processing based on morphological electronic dictionaries and finite state technology. The results we present do not aim to be comprehensive but rather to illustrate the range of possibilities, one of which is that compounds processed in the suggested way can be used in much the same way as simple words.

Končni transduktorji za razpoznavanje in generiranje tvorjenk

V prispevku pokažemo, kako lahko končne transduktorje učinkovito uporabljamo za obravnavanje zloženk pri analizi besedila. Pristop, ki ga uporabljamo, je posebej primeren za obdelovanje besedila na podlagi uporabe morfoloških elektronskih slovarjev in tehnologije končnih avtomatov. Predstavljeni rezultati niso izčrpni; njihov namen je namreč ponazoritev možnosti. Ena od teh možnosti je, da tvorjenke, ki so obdelane na predlagani način, lahko uporabljamo zelo podobno kot netvorjene besede.
Similar resources
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Continuous Speech Recognition Based on Deterministic Finite Automata Machine using Utterance and Pitch Verification
This paper introduces a set of acoustic modeling techniques for utterance verification (UV) based continuous speech recognition (CSR). Utterance verification in this work implies the ability to determine when portions of a hypothesized word string correspond to incorrectly decoded vocabulary words or out-of-vocabulary words that may appear in an utterance. This capability is implemented here as...
On the Road to Improved Lexical Confusability Metrics
Pronunciation modeling in automatic speech recognition systems has had mixed results in the past; one likely reason for poor performance is the increased confusability in the lexicon from adding new pronunciation variants. In this work, we propose a new framework for determining lexically confusable words based on inverted finite state transducers (FSTs); we also present experiments designed to...
Speech Recognition with Weighted Finite-state Transducers
This chapter describes a general representation and algorithmic framework for speech recognition based on weighted finite-state transducers. These transducers provide a common and natural representation for major components of speech recognition systems, including hidden Markov models (HMMs), context-dependency models, pronunciation dictionaries, statistical grammars, and word or phone lattices...
Towards a Unified Framework
Conversational interfaces have received much attention as a promising natural communication channel between humans and computers. A typical conversational interface consists of three major systems: speech understanding, dialog management and spoken language generation. In such a conversational interface, speech recognition as the front-end of speech understanding remains to be one of the fundam...
Journal title:
Volume, issue:
Pages: -
Publication date: 2006